Abstract

This study aims to explore the impact of learner grade, visual cueing, and control design on children’s reading 
achievement of audio e-books with tablet computers. This research was a three-way factorial design where the first 
factor was learner grade (grade four and six), the second factor was e-book visual cueing (word-based, line-based, 
and paragraph-based cueing), and the third factor was e-book control mechanism (system-controlled and 
learner-controlled). The e-books used in this study were on the topic of six major classes of nutrients and the content 
was created with Adobe Flash CS 5.5. This research was guided by this question: Is there any interaction among 
learner grade, visual cueing, and learner control on children’s reading of audio e-books in terms of learning 
achievement (recall and transfer)? A sample of one hundred and eighty-five fourth and sixth graders joined the study, 
and participants were assigned into one of these groups randomly. The results showed significant interactions among 
these three factors. The fourth graders found the paragraph-based cueing and system-controlled mechanism 
advantageous; however, the sixth graders performed equally well under all different conditions of interface design. 
The results indicated that the fourth and sixth graders have different characteristics as readers and, by extension, 
suggest audio e-book design should consider the target readers’ cognitive and psychological development. 

Keywords: elementary education; human-computer interface; visual cueing; audio e-books


1. Introduction 

The use of e-books with tablets for educational purposes has become widespread globally, and the adoption of the 
technological features has the potential to alter the educational process of teaching and learning (Dündar &
Akçayır, 2014; Felvegi & Matthew, 2012). Tablet computers make it possible for students to hold libraries in their
hands, and provide students with opportunities to read in different locations at any time with diverse media (Falloon, 
2013; Felvegi & Matthew, 2012; Schreus, 2013). The reading of e-books on tablets has transformed the entire 
reading process, moving from static printed text to highly interactive text in a variety of formats, software and 
display options (Felvegi & Matthew, 2012). Tablet-based e-books support children’s emerging literacy development 
by extending the connections elementary students make with the text through touch-screen manipulation and 
text-to-speech format. As a result, it is recommended that school teachers use tablets with true curricular integration 
rather than as add-ons to instruction (Hutchison, Beschorner, & Schmidt-Crawford, 2012; Larson, 2010; Moody, 
2010). A few studies have begun to elaborate on the features of tablet-based e-books that may especially influence 
children’s literacy skills, including font size manipulation, text-to-speech tools, dictionaries, automatic page turning, 
and animation hotspots. However, some researchers have pointed out that low-quality e-books offered distracting 
digital features such as overloaded animation, sound unrelated to the story, or inappropriate text-to-speech features 
(Miller & Warschauer, 2013; Moody, 2010). 

The text-to-speech feature is one advantageous trait of tablet-based e-books over traditional print sources since 
reading aloud has been found to be one of the most effective forms of teaching children to read (Beck & McKeown,
2001). The provision of both audio and textual information is a useful technique in promoting self-learning (Gibson, 
2008). Although people often believe that using spoken information simultaneously with the same written text is 
beneficial for learning, available evidence indicates that learning could be inhibited by the presentation of the same
verbal information in both modalities (Kalyuga, Chandler, & Sweller, 1999). This is known as the verbal redundancy 
effect, which refers to the restrained situation of learning when there is the simultaneous presentation of text and 
narration with identical words (Kalyuga, 2012). Researchers argued that verbal redundancy effects may occur when 
learners are required to integrate sources of audio and textual information (Kalyuga, 2012; Kalyuga, Chandler, & 
Sweller, 1999; Moreno & Mayer, 2002). Several strategies, such as asynchronous presentation of sources, use of 
visual cueing, and learner-controlled design, were presented to solve the verbal redundancy problem to keep the 
text-to-speech feature working better (Kalyuga, Chandler, & Sweller, 2004; Kalyuga, 2012). These strategies play 
important roles in the audio-visual design of e-books; however, few studies have taken an integrative viewpoint by 
considering more than one strategy. Moreover, limited studies focus on the design of the text-to-speech feature for 
different ages of young children (Moody, 2010). 

This study aims to understand how certain strategies work on the text-to-speech design of audio e-books. The goal of 
this research is to investigate the impact of visual cueing and control design on the reading of audio books on tablet 
computers by children in different grades. This study picked one unit from a children’s textbook in health education, 
Six Major Classes of Nutrients, and reworked the content into an audio e-book. This research used e-books as the 
media for content delivery because e-books have become extensively popular across a range of education levels and 
subject domains, and studies claim that the multimedia features in e-books help children improve their reading
comprehension skills (Grimshaw, Dungworth, & McKnight, 2007). The goal of this study is to compare learning
achievement among e-books with different visual cueing designs (word-based, line-based, and
paragraph-based) and control mechanisms (system-controlled and learner-controlled) for fourth and sixth graders,
and to consider how children’s grade level influences the use of audio books. The research question is: Is there any
interaction among grade, visual cueing, and learner control on children’s reading of audio books in terms of learning 
achievement (recall and transfer)? A brief theoretical background is presented in the following section to clarify
the relevant factors before turning to the experimental design that addresses this question.


2. Theoretical Background 

2.1 Verbal Redundancy Effect 

The design of audio e-books’ text-to-speech feature is closely related to the dual processing model theory (Paivio, 
1971), and previous empirical studies on multimedia modalities could serve as important references (Kalyuga, 2012; 
Mayer, 2009). A dual-processing model of memory (also known as the dual coding theory) considers working 
memory capacities to be distributed over separate verbal (including speech and text) and visual channels (Paivio, 
1971; 1986). Baddeley (1986) then expanded the dual coding theory by dividing the working memory into a 
phonological loop and a visual-spatial sketchpad. The phonological loop processes auditory information (including 
visually presented language) whereas the visual-spatial sketchpad deals with visual information such as diagrams and 
pictures. Since speech and text go into the same verbal channel, learners might be cognitively overloaded if the two 
sources of information are presented at the same time (Kalyuga, Chandler, & Sweller, 2004). When reading an audio 
e-book, learners may not be able to temporally and spatially synchronize the media and may not be able to determine 
exactly what within the visual content and the sound content is important. In such cases, not only will the auditory 
and visual channels fail to interact but learners will also have to use a considerable amount of cognitive resources to 
determine where the audio and texts should be synchronized. As a result, a considerable amount of content 
comprehension is lost (Kalyuga, 2012). This is known as verbal redundancy effect and several studies have reported 
the existence of this effect. Kalyuga, Chandler and Sweller (1999; 2004) used lengthy technical textual materials 
without any visual aids included to demonstrate a verbal redundancy effect. They observed that the addition of 
concurrent audio explanations to texts had a negative effect on learning. Mayer (2009) argued that concurrent
on-screen text, animation, and narration overloaded the verbal channel because of competition between the two
verbal sources (text and audio) for cognitive resources. Although processing both auditory and textual material may
overload working memory and decrease learning outcomes, researchers have been looking for approaches to improve 
audio-visual design and optimize the integration of information through different modalities. Several strategies were 
presented to solve the verbal redundancy problem to direct learner attention and improve information integration, 
thus causing the text-to-speech feature to work better (Kalyuga, Chandler, & Sweller, 2004; Kalyuga, 2012). One 
common solution to this problem is to use visual cueing. The following section discusses the use of visual cueing in 
the multimedia learning environment. 

2.2 Visual Cueing

Visual cueing refers to the addition of non-content information (e.g., pointers, arrows, circles, and colors) to visual
representations (Ando & Ueno, 2008; Lin & Atkinson, 2011). Visual cues highlight the key points of instructional
materials and thus guide attention, reducing visual search and extraneous cognitive load and enhancing source
integration in a multimedia learning environment (de Koning, Tabbers, Rikers, & Paas, 2010). From a cognitive load
perspective, visual cueing is an effective method to reduce extraneous load in the multimedia learning environment
and is helpful to the processing of audio and text. Several studies have supported the instructional benefits of visual
cueing (de Koning, Tabbers, Rikers, & Paas, 2010; Jamet, Gavota, & Quaireau, 2008; Lin & Atkinson, 2011).
Researchers have argued that learners had higher comprehension and transfer scores in a cued animation with
spotlight effect cues (de Koning, Tabbers, Rikers, & Paas, 2010). Lin & Atkinson (2011) used arrows as visual cues
and claimed that learners displayed more instructional efficiency (retained more concepts but spent less time) than
uncued peers. Jamet, Gavota, & Quaireau (2008) used a coloring technique as a visual cue and suggested that
learners performed better with saliently colored material. Some researchers looked for more advanced visual cueing
designs to improve learning efficiency. For example, in de Koning et al.’s (2010) study, visual cues were designed
and categorized as small-scale single cue and large-scale multiple cues for learning efficiency comparison. Although 
no clear effects occurred in that experiment, it broadened the possibilities of research issues on the design approach 
of visual cueing. 

This study was inspired by Jamet et al.’s (2008) study where the coloring technique was chosen as the visual cueing 
strategy, and de Koning et al.’s (2010) study of comparisons for diverse cueing scales. We recorded voice narration 
of the text for the e-book and each part of the text was colored to correspond with the narrators’ voice. We then 
designed visual cues as three types: word-based cueing, line-based cueing, and paragraph-based cueing. Each type of 
cueing addresses a different length of textual scale to be colored in sync with the narration. 
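
The three cueing scales can be pictured as three ways of segmenting the same text into highlight units. The sketch below is only an illustration in Python, not the ActionScript implementation used in the study; the function name `cue_units`, the convention that a blank line separates paragraphs, the whitespace-based word split, and the sample text are all assumptions made for this example.

```python
from typing import List


def cue_units(text: str, level: str) -> List[str]:
    """Split text into the units that would be colored one at a time.

    Assumes (for illustration only) that paragraphs are separated by a
    blank line, lines by a newline, and words by whitespace.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    if level == "paragraph":
        # One highlight unit per paragraph, with line breaks collapsed.
        return [" ".join(p.split()) for p in paragraphs]
    lines = [ln.strip() for p in paragraphs for ln in p.splitlines() if ln.strip()]
    if level == "line":
        return lines
    if level == "word":
        return [word for ln in lines for word in ln.split()]
    raise ValueError(f"unknown cueing level: {level}")


# Made-up sample text, not taken from the study's e-book.
sample = "Protein builds muscle.\nMilk and eggs contain protein.\n\nFat stores energy."

for level in ("word", "line", "paragraph"):
    print(level, len(cue_units(sample, level)))
```

The finer the scale, the more highlight changes the reader has to follow while listening, which is the design dimension the three versions vary.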

2.3 Control Design 

The term learner control describes whether learners have multiple ways of interacting with instructional materials,
such as sequencing, selection of content, representation, and pacing (Scheiter & Gerjets, 2007). In this study, it refers 
to the design of reading options, enabling the buttons to go forward, backward or restart. Some researchers have 
argued that a learner-controlled program increases learning outcomes because of its adaptation to preferences and 
cognitive needs, and its affordances for constructive and deeper information processing (Merrill, 1980; Patterson, 
2000; Scheiter & Gerjets, 2007). However, some researchers have claimed that the learner-control mechanism 
benefits adults but is not helpful for young children who are not well-prepared for self-learning (Johnson, Perry, &
Shamir, 2010). Studies show that the control design may mediate the outcomes of learning integration of audio and
visual sources. Ginns’ (2005) meta-analysis of the modality effect, which targeted forty-three cases of relevant
research, argued that learners’ integration of auditory and visual sources is moderated by learner control (e.g., pacing
of the presentation). That is, learners’ integration of two information sources was different in studies where the pace
was set by the system from those studies in which readers controlled their own pace (Ginns, 2005). However, among
the forty-three cases in Ginns’ study, none targeted elementary learners as participants. The understanding of the
control design on children’s learning of visual-audio material is limited. 

2.4 Children’s Development of Information Processing and Integration

The effects of audio-visual instruction found in adults may be different from those in children. Theorists have 
proposed that there are age changes in children’s cognitive development and information processing (Bruner, 1964; 
Perlmutter & Myers, 1975; Rohwer, 1970; Shaffer, 2002). Researchers argue that motor representation precedes 
visual representation, which in turn precedes verbal information (including textual and auditory information) (Bruner, 
1964; Rohwer, 1970). It is not until the age of five that visual coding becomes more efficient, and then children start 
to develop the verbal coding ability (Perlmutter & Myers, 1975). During the elementary school stage, children’s 
verbal and visual coding abilities continue to develop and memory capacity keeps expanding. The ability
to link visual, textual, and audio channels and the strategies to switch between and integrate them are also developing.
The human brain is not fully developed until late adolescence and in some males not
until early adulthood (Case, 1985). Children are not able to fully take advantage of audio-visual instruction, and
this might be due to several reasons. First, children may have inadequate attentional control towards learning 
materials (Mann, Schulz, & Cui, 2011). Educational multimedia require children to actively listen and read 
instructional content, and mentally articulate the meaning in the text and audio; however, children may not be able to 
consciously focus their attention to bind the separate stimuli into a unitary object (Matlin, 2002). Second, children
may have an under-developed phonological loop (Mann, Schulz, & Cui, 2011). The phonological loop as a whole 
deals with sound or phonological information, and plays a crucial role in the acquisition of the phonological form of
lexical items (Baddeley, 1986). Adults are able to listen to a sound presented in multimedia and encode the gist 
directly into their phonological store, encoding the details indirectly through their articulatory loop. Children are not 
fully capable of articulating difficult or unfamiliar content presented in text and audio. Their auditory memory 
consists of a phonological store without a phonological loop (Mann, Schulz, & Cui, 2011). Unarticulated material in 
young children is analogous to extraneous cognitive load, and they need more aids to better articulate different 
sources of information (Kalyuga, Chandler, & Sweller, 1999). 

This study selected the fourth and sixth graders as participants because they represent different stages of 
psychological and literacy development in late childhood (Su & Samuels, 2010). The sixth graders, whose age ranges 
from twelve to thirteen, are approaching the formal operational stage (adolescence and into adulthood) with a 
well-developed cognitive ability and phonological loop (Piaget, 1970). The cognitive development in this stage 
becomes stable and individual differences are smaller when compared with the fourth graders. The sixth graders have 
better cognitive ability such as attention, reasoning and organizing, and are more capable of utilizing strategies to 
solve problems in the learning process (Piaget, 1970; Justice, 1985; Shaffer, 2002). 

The fourth graders have just reached a basic literacy threshold, by which one is just able to read
individually and independently (Wang, Hung, Chang, & Chen, 2008). Based on Chall’s model (2003), the fourth
graders read not only to learn to recognize words, but also begin to learn new knowledge, information, thoughts, and
experiences. Reading in this stage shifts from "learning to read" to "reading to learn." This change requires learners 
to read texts with the inclusion of a more extensive vocabulary, a heavier content load and a need for more 
background knowledge and learning strategies (Chall & Jacobs, 2003). From the developmental psychologist’s 
viewpoint, learners in this grade are in a particular concrete operational stage (Piaget, 1970; Shaffer, 2002). Most 
students still lack the ability to utilize and activate strategies for information processing and integration (Guttentag,
Ornstein, & Siemens, 1987; Justice, 1985; Justice, Baker-Ward, Gupta, & Jannings, 1997). Facing increasingly
complex reading tasks with underdeveloped cognitive abilities and an underdeveloped phonological loop, students in this
stage can easily find themselves in a “fourth-grade slump,” which refers to the situation in which students fall behind in
reading in the fourth grade (Chall & Jacobs, 2003). The fourth-grade slump problem will disappear as the child gets
older. Theoretically, the needs of the fourth graders are particularly different from those of the sixth graders, who are
more mature readers and whose reading capabilities and cognitive abilities are more like those of young adults. Because
of these differences in cognitive and literacy development, children’s responses to and needs for audio e-books may
differ. Thus, this study aims to address this issue and explore the appropriate design approach of audio e-books
to meet the needs of students in these two grades.


3. Method 

3.1 Design and Development of an Audio E-book for Six Major Classes of Nutrients 

3.1.1 Program Design and Development 

Adobe Flash CS5.5 Professional and ActionScript 3.0 were used to create the audio e-book. The program was
designed with an 841 px × 595 px window (A4 size). The screen size made it possible to install the audio
e-book program on mobile devices; in this study, tablet computers were the primary device used. For the needs of
this particular study, this audio e-book was developed into several versions using the same story. The following 
treatment section shows snapshots of these designs. 

3.2 Research Questions 

This research aimed to explore the impact of visual cueing and control design on different graders’ reading 
achievement of audio books on tablet computers. This research was a three-way factorial design. The first factor was 
learner grade in two levels: (1) fourth grade, and (2) sixth grade; the second factor was e-book visual cueing in three 
levels: (1) word-based, (2) line-based, and (3) paragraph-based cueing; the third factor was e-book control 
mechanism in two levels: (1) system-controlled, and (2) learner-controlled. The dependent variables were: (1)
recall scores, and (2) transfer scores. The research questions were: (1) Is there any interaction among learner grade, 
visual cueing and control design in children’s learning recall?; (2) Is there any interaction among learner grade, 
visual cueing and control design in children’s learning transfer? 
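
As a quick illustration of the design described above, the 2 × 3 × 2 factorial structure can be enumerated directly. This is a minimal Python sketch; the variable names are chosen for this example only.

```python
from itertools import product

# Factor levels as described in the text.
GRADES = ("fourth", "sixth")
CUEING = ("word-based", "line-based", "paragraph-based")
CONTROL = ("system-controlled", "learner-controlled")

# Every cell of the three-way factorial design.
cells = list(product(GRADES, CUEING, CONTROL))

for grade, cueing, control in cells:
    print(f"{grade:7s} | {cueing:16s} | {control}")

print(len(cells), "cells in total")
```

Within each grade there are six cueing-by-control combinations, which correspond to the six e-book versions a student could be assigned to.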

3.3 Participants 

The experiment was carried out in the spring semester in 2014. Participants were invited from one elementary school 
in Taiwan. One hundred and eighty-five participants joined this study; eighty-seven were fourth
graders and ninety-eight were sixth graders; eighty-eight (48%) were female and ninety-seven (52%) were male.
Participants were randomly assigned into groups and the size of each group ranged from 11 to 19 depending on the 
learner grade. 
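
The paper does not describe the exact randomization procedure. One common approach, sketched below in Python purely as an assumption, is to shuffle the students within a grade and deal them round-robin into the six e-book versions. The function name `assign`, the seed, and the shortened condition labels are hypothetical. Note that this procedure yields nearly equal group sizes, whereas the reported groups ranged from 11 to 19, so the study's actual procedure evidently differed.

```python
import random
from itertools import product


def assign(student_ids, conditions, seed=0):
    """Shuffle students, then deal them round-robin into the conditions."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    return {sid: conditions[i % len(conditions)] for i, sid in enumerate(shuffled)}


# Six versions: three cueing levels crossed with two control mechanisms.
conditions = list(product(("word", "line", "paragraph"), ("system", "learner")))

# Eighty-seven fourth graders, identified here by placeholder integers.
fourth = assign(range(87), conditions)

sizes = {c: 0 for c in conditions}
for condition in fourth.values():
    sizes[condition] += 1
print(sizes)
```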

3.4 Treatment 

As mentioned above, there were several versions of the audio e-book with different levels of visual cueing and 
control, and students were randomly assigned into one of the different treatment groups. As to the visual cueing 
design, there were three types of visual cueing: (1) word-based cueing, (2) line-based cueing, and (3) 
paragraph-based cueing. The three visual cueing versions were designed around three
levels of exposure (word, line, or paragraph) through text highlighting. In each design, the text was colored differently to
accompany the narrator’s reading. Figure 1 shows snapshots of the visual cueing for the three versions of the audio
e-books. As to the control design, there were two types of control: (1) system-controlled and (2) learner-controlled. 
The audio e-book with the learner-controlled mechanism allowed learners to navigate backward, forward, and restart, 
whereas the audio e-book with the system-controlled mechanism did not. The audio e-book with the 
system-controlled mechanism only allowed learners to go sequentially with the order set by the system. Figure 2 
shows snapshots of the control design for the two versions of the audio e-books. 
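
The difference between the two control mechanisms can be summarized as a small state machine. The following Python sketch is an illustration only, not the study's ActionScript code; the class name `EBookReader`, the method names, and the choice to silently ignore navigation requests in system-controlled mode are assumptions.

```python
class EBookReader:
    """Tracks the current page of an e-book under one of two control modes."""

    def __init__(self, n_pages: int, learner_controlled: bool) -> None:
        self.n_pages = n_pages
        self.learner_controlled = learner_controlled
        self.page = 0

    def auto_advance(self) -> int:
        """Sequential step driven by the system, e.g. when narration ends."""
        self.page = min(self.page + 1, self.n_pages - 1)
        return self.page

    def navigate(self, action: str) -> int:
        """Apply a learner request; ignored in system-controlled mode."""
        if not self.learner_controlled:
            return self.page
        if action == "forward":
            self.page = min(self.page + 1, self.n_pages - 1)
        elif action == "backward":
            self.page = max(self.page - 1, 0)
        elif action == "restart":
            self.page = 0
        else:
            raise ValueError(f"unknown action: {action}")
        return self.page


system = EBookReader(n_pages=5, learner_controlled=False)
learner = EBookReader(n_pages=5, learner_controlled=True)
```

In this sketch a system-controlled reader can only move through `auto_advance`, while a learner-controlled reader can also go forward, go backward, or restart.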



■ Upper left: word-based cueing 

■ Upper right: line-based cueing 

■ Lower left: paragraph-based cueing 

Figure 1. Snapshots of the Visual Cueing for the Versions of the Audio E-Books 

Left: System-controlled



Figure 2. Snapshots of the Control Design for Two Versions of the Audio E-Books 

3.5 Instruments 

The achievement tests (consisting of a pretest and a posttest) were designed with twelve multiple-choice questions for
recalling and one matching question for transferring. These two types of questions were used to examine the two 
different cognitive dimensions of learning (recall and transfer). All questions were created to test students’ 
knowledge of the six major classes of nutrients. One sample question for recalling is: What nutrient do milk, eggs, 
fish, and beans contain in common? One sample question for transferring is: Mary was diagnosed by the doctor as 
having symptoms of anemia. Which kind of food do you suggest Mary eat to fight the disease? The pretest and 
posttest were composed of similar questions, comparable in difficulty level and format. On both tests, each student 
had two separate scores (recall and transfer) for achievement. An experienced elementary school teacher and a 
nutritionist verified the test questions for expert validity. 

3.6 Experimental Procedures 

The pretest was conducted two weeks before the experiment to assess learners’ prior knowledge. During
the experiment, we randomly assigned each student to one group and asked students to read the e-book individually. Students
completed the reading task on tablet computers, followed by the posttest, in a computer lab. We gave students
twenty minutes for the reading task, and the whole experiment took one hour.


4. Results 

There was no significant between-group difference in any aspect of the student pretest for the fourth and sixth
graders respectively, which means that, within each grade, the groups had comparable starting points for learning. However,
differences occurred in the posttest scores. The descriptive statistics for the pretest and posttest scores are as follows 
(Table 1). 

Table 1. Descriptive Statistics on Learning Outcome for Each Group

                                             Recall Pretest    Transfer Pretest  Recall Posttest   Transfer Posttest
Grade   Control design      Visual cueing    Mean     SD       Mean     SD       Mean     SD       Mean     SD
Fourth  System-controlled   Word-based       4.67     1.28     3.06     1.00     6.06     1.31     3.50     1.30
                            Line-based       5.57     1.79     3.00     1.52     7.00     1.20     3.43     1.28
                            Paragraph-based  5.36     1.12     3.00     1.79     7.55     1.44     4.18     1.08
        Learner-controlled  Word-based       4.89     1.45     3.11     1.45     6.39     1.72     3.61     1.58
                            Line-based       5.00     1.16     3.15     1.35     6.31     1.49     4.15     1.63
                            Paragraph-based  4.92     1.19     3.69     1.18     5.77     1.54     4.23     1.09
Sixth   System-controlled   Word-based       4.00     1.58     3.65     1.62     6.29     1.86     4.29     1.11
                            Line-based       4.20     2.00     3.27     1.22     7.53     1.77     3.93     1.10
                            Paragraph-based  4.89     1.41     3.78     1.52     6.61     1.61     4.17     1.54
        Learner-controlled  Word-based       4.26     1.45     3.53     1.93     6.95     1.78     3.74     2.00
                            Line-based       4.50     1.61     3.57     1.45     6.21     2.08     4.07     1.44
                            Paragraph-based  4.20     1.78     3.87     1.69     7.40     1.88     4.40     1.45
Mean                                         4.70     1.48     3.38     1.48     6.65     1.66     3.96     1.39

Note. Total score for recall was 10; total score for transfer was 6.


4.1 Recall Scores

Before conducting the three-way ANOVA analysis to understand interactions, this study did an analysis to understand 
which combination of the visual design and control design performed best. Six combinations were generated and 
tested in the study (system controlled with word-based cueing, learner controlled with word-based cueing, system 
controlled with line-based cueing, learner controlled with line-based cueing, system controlled with paragraph-based 
cueing, learner controlled with paragraph-based cueing). For the fourth graders, significant difference existed among
these groups (F(5,81) = 2.40, p = .04, η² = .13). The post hoc results show that students in the paragraph-based cueing
group with the system-controlled design performed significantly better than all other groups. For the sixth graders,
no significant difference existed among the six groups in terms of learner recall (F(5,92) = 1.42, p = .23, η² = .01).
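
For readers who want to relate a one-way F statistic to its effect size, η² can be recovered from F and the two degrees of freedom, because η² = SS_between / SS_total = (F × df_between) / (F × df_between + df_within). The Python sketch below applies this identity to the fourth-grade values reported above. The function name is chosen for this example, and the calculation is offered as a plausibility check, not as the authors' own computation.

```python
def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Effect size of a one-way ANOVA recovered from F and its degrees of freedom."""
    effect = f_value * df_between
    return effect / (effect + df_within)


# Fourth graders: F(5, 81) = 2.40 as reported in the text.
fourth_grade_eta_sq = eta_squared_from_f(2.40, 5, 81)
print(round(fourth_grade_eta_sq, 2))  # 0.13
```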

Subsequently, a three-way ANOVA was conducted to understand interactions among the factors. The results showed
significant interactions among grade, visual cueing and learner control (F = 3.36, p < .05, η² = .04). To explore the
nature of the interactions, tests of the simple interaction effects were performed (Table 2). In the simple interaction
effect analysis, the results show the following significant interactions: 1) Interactions between visual cueing and
control design existed among the fourth grade students (F = 3.69, p = .03, η² = .08); 2) Interactions between learner grade
and control design existed among students using the paragraph-based cueing e-books (F = 8.41, p = .01, η² = .14).

Table 2. Results of Simple Interaction Effect

Source of Variance                  SS       df   MS       F      p      η²
Visual cueing * Learner control
  Within fourth graders             16.114   2    8.057    3.69   .03*   .08
  Within sixth graders              21.21    2    10.60    3.18   .05    .07
Grade * Visual cueing
  Within system-controlled group    8.51     2    4.26     1.72   .19    .04
  Within self-controlled group      10.46    2    5.23     1.68   .19    .04
Grade * Learner control
  Within word-based group           .46      1    .46      .16    .69    .00
  Within line-based group           1.37     1    1.37     .48    .49    .01
  Within paragraph-based group      22.69    1    22.69    8.41   .01*   .14

*p < .05


Further analysis of the two interactions was performed. As to the interaction between visual cueing and control
design which existed among the fourth graders, the results of the further analysis are presented in Table 3. In the
fourth grade group, when the e-book was system-controlled, the students in the paragraph-based cueing design group
had significantly better scores than students in the word-based cueing design group (F = 4.61, p = .02, η² = .19). In addition, for the same fourth grader
group, while reading an e-book with paragraph-based cueing design, students with the system-controlled mechanism
had significantly better scores than students with the learner-controlled mechanism (F = 8.43, p = .01, η² = .28).

Table 3. Results of the Visual Cueing * Control Design Interaction for the Fourth Grade Group

Source of Variance            SS       df   MS       F      p      η²     Post Hoc
Control
  In system controlled        16.51    2    8.26     4.61   .02*   .19    Paragraph > word
  In learner controlled       3.19     2    1.60     .62    .54    .03
Visual cueing
  In word-based group         1.00     1    1.00     .43    .52    .01
  In line-based group         3.23     1    3.23     1.66   .21    .06
  In paragraph-based group    18.80    1    18.80    8.43   .01*   .28    System controlled > learner controlled

*p < .05

In the paragraph-based cueing group, the sixth graders had significantly better scores than the fourth graders when
the e-book was learner-controlled (F = 6.18, p = .02, η² = .19). In addition, for the same paragraph-based cueing group,
the fourth graders had significantly better scores when the e-book was system-controlled compared to
learner-controlled (F = 8.43, p = .01, η² = .28) (Table 4).

Table 4. Results of the Grade * Control Design Interaction for the Paragraph-Based Cueing E-Book Group

Source of Variance            SS       df   MS       F      p      η²     Post Hoc
Grade
  Within fourth graders       49.04    1    18.80    8.43   .01*   .28    System controlled > learner controlled
  Within sixth graders        5.09     1    5.09     1.68   .20    .05
Control
  Within system controlled    5.96     1    5.96     2.48   .13    .08
  Within learner controlled   18.52    1    18.52    6.18   .02*   .19    Sixth > fourth

*p < .05

4.2 Transfer Scores 

For the transfer scores, the ANOVA showed that the three-way interaction among grade, visual cueing and control 
design was not significant (F = .45, p > .05). In addition, there existed no significant two-way interaction effects
among these factors. 

5. Discussion 

This study explored the interactions among visual cueing, control design and children’s grade level on the reading of 
audio e-books on tablet computers in terms of learning achievement. The results showed that interactions existed in 
learner recall but not in learner transfer. The lack of significant effects on transfer scores might be associated with the short
learning time in the experiment (only twenty minutes). Learners had limited time to absorb new knowledge and
engage in higher-level linkage, integration, and advanced application. Regarding the recall scores analyzed in
the one-way or three-way ANOVA, the study had two important findings: (1) For the fourth graders, the combination 
of paragraph-based cueing and system-controlled mechanism best enhanced their recall; (2) The cueing and control 
design did not significantly impact the sixth graders’ recall at all. 

In terms of the cueing design, for young learners (the fourth graders) who are in the concrete operational stage as well 
as the “fourth-grade slump” phase, the word-based cueing design might generate more cognitive load due to learners’ 
under-developed phonological loop, limited memory capacity, and restricted information linkage and processing 
strategies. In the word-based cueing design, the audio and the text were presented at exactly the same time, and 
learners had to listen to the audio of each word and watch the text simultaneously. Students may have spent most of 
their effort matching sounds to words, leaving little capacity to understand the delivered content. The information 
delivered may not have been truly meaningful to learners during the reading process. In the case of the paragraph-based 
cueing design, students did not have to catch textual information word by word. They could read the words at their own pace, and 
the visual marking became a reminder to tell learners where the narrator was. Moreover, in the paragraph-based 
cueing design, it was possible for learners to rely on only visual information or only auditory information since the 
narration began after the text changed color. Since the text and audio were not presented at exactly the same time, 
learners may have taken advantage of only one of the channels. With paragraph-based visual cueing, the 
non-concurrent presentation of identical information sources likely did not exceed working memory capacity, and thus 
increased the effectiveness of learning. The visual cueing effect disappeared as learner age increased. In this study, 
the cueing design did not make any significant difference for the sixth graders. The sixth graders had a better-developed 
phonological loop, better learning strategies, and more background knowledge to integrate information from 
different sources, and consequently the cueing design might no longer have played a critical role in their reading of 
audio e-books. 

In terms of the control design, this study shows that the fourth graders had better recall scores with the 
system-controlled mechanism, but the sixth graders performed equally in all conditions. The audio e-book with the 
learner-controlled mechanism allowed learners to navigate in a flexible way using the backward, forward, and restart 
buttons, whereas students in the system-controlled group could only read the audio e-book in a linear way and they 
had to follow the order and pace set by the system. Several researchers argue that interactive learner-controlled 
environments may lead adults to better results through adaptation to preferences and cognitive needs and through 
affordances for constructive and deeper information processing (Merrill, 1980; Patterson, 2000; Scheiter & Gerjets, 
2007). However, an increasing number of studies suggest that the learner-controlled mechanism is less helpful 
for elementary learners who have little experience with digital learning environments and reading tutorials on 
computers (Johnson, Perry, & Shamir, 2010; Schwarz, Anderson, Hong, Howard, & MaGee, 2004; Wang & Yang, 
2014). Wang and Yang’s (2014) study shows that for 10-year-old elementary learners (fourth graders), a 
simple linear sequence is still better than an interactive learner-controlled mechanism in a digital reading 
environment. Young learners tend to benefit from a comparatively simple, linear, system-controlled design due 
to their limited attentional control and metacognitive abilities (de Jong, 2004; Pearman & Chang, 2010; Wang, 2014), 
and the benefit of the system-controlled mechanism might change as they increase in age and mature cognitively and 
psychologically. In this study, the advantage of the system-controlled environment did not appear for the sixth-grade 
students. The age of the sixth graders ranged from eleven to twelve, which falls in Piaget’s formal operational stage. 
Learners in this stage are more cognitively developed and capable of abstract thinking, reasoning, and 
information linkage (Piaget, 1970; Justice, 1985; Shaffer, 2002). Also, the problem-solving ability and 
learning strategies of the sixth graders were comparatively better than those of the fourth graders (Piaget, 1970; Shaffer, 
2002). The sixth graders were better able to take advantage of the forward, backward, and replay functions to 
facilitate information integration. With children’s increasing maturity in cognitive, psychological, and language 
development, the role of the learner-controlled mechanism becomes comparatively more important in the reading process. 

The needs of the fourth graders differ from those of the sixth graders, who are more mature readers and whose 
reading abilities in their native language (Chinese) are closer to those of young adults; this may 
explain why the fourth and sixth graders performed differently in certain groups. Educators and 
researchers should take learners’ cognitive development into consideration, especially for young children, when 
designing audio e-books. 


6. Conclusion 

The results showed interactions among visual cueing, control design, and learner grade on learner recall. The fourth 
graders found the paragraph-based cueing and system-controlled mechanism advantageous. Unlike the fourth 
graders, who performed better with a system-controlled mechanism, the sixth graders performed equally well under all 
interface conditions. 

This study has the following limitations. First, it targeted the six major classes of nutrients as the 
content matter. The knowledge in this unit is mainly factual and conceptual rather than procedural or 
metacognitive, and a high level of mental operation or reasoning is not required. Consequently, the results 
might differ if the content matter were changed to other subjects with different cognitive demands. Second, the 
research was carried out in a computer laboratory within only a short time slot, so students’ reading motivation 
and performance might differ from those in a real classroom learning environment. Although these limitations 
may not allow overgeneralization to broad processes of learning in other subjects, reporting this work is still 
valuable for its practical suggestions for children’s audio e-book design and its attempt to expand multimedia 
learning theory. 

Future research might explore these effects in a real-world classroom with instructional strategies 
rather than in a laboratory, and might revise current instruments and create new measures to better 
understand learner responses to different designs of audio e-books. In addition, eye-trackers could be 
used in future studies to better explore learners’ visual processing of texts, and qualitative analysis could be 
conducted to seek an in-depth understanding of learner behavior and learning outcomes. 

In summary, to make audio e-books work better, the learner’s cognitive capability must be considered. Children at 
different grade levels may need different visual cueing and control designs, and instructors need 
to assess whether an audio e-book is accessible for young learners at different developmental stages. To advance the 
understanding of the affordances and constraints of text-to-speech features for children’s learning, relevant empirical 
studies on multimedia design are needed to substantially expand our knowledge of their instructional promises and 
perils. 